
Keyword Search Result

[Keyword] Hidden Markov Model(71hit)

21-40hit(71hit)

  • Binary Oriented Vulnerability Analyzer Based on Hidden Markov Model

    Hao BAI  Chang-zhen HU  Gang ZHANG  Xiao-chuan JING  Ning LI  

     
    LETTER-Dependable Computing

      Vol:
    E93-D No:12
      Page(s):
    3410-3413

    The letter proposes a novel binary vulnerability analyzer for executable programs that is based on the hidden Markov model (HMM). A vulnerability instruction library (VIL) is first constructed by collecting binary frames located by double precision analysis. Executable programs are then converted into structured code sequences using the VIL. The code sequences are essentially context-sensitive and can therefore be modeled by an HMM. Finally, the HMM-based vulnerability analyzer is built to recognize potential vulnerabilities in executable programs. Experimental results show that the proposed approach achieves lower false positive and false negative rates than the latest static analyzers.

  • Spectrum Handoff for Cognitive Radio Systems Based on Prediction Considering Cross-Layer Optimization

    Xiaoyu QIAO  Zhenhui TAN  Bo AI  Jiaying SONG  

     
    PAPER

      Vol:
    E93-B No:12
      Page(s):
    3274-3283

    The spectrum handoff problem for cognitive radio systems is considered in this paper. The secondary users (SUs) can only opportunistically access the spectrum holes, i.e., the frequency channels unoccupied by the primary users (PUs). As soon as a PU appears, SUs have to vacate the channel to avoid interfering with PUs and switch to another available channel. In this paper, a prediction-based spectrum handoff scheme is proposed to reduce the negative effects (both the interference to PUs and the service blocking of SUs) during the switching time. In the proposed scheme, a hidden Markov model is used to predict the occupancy of a frequency channel. By estimating the state of the model at the next time instant, we can predict whether the frequency channel will be occupied by PUs or not. As a cross-layer design, the spectrum sensing performance parameters, namely the false alarm probability and the missed detection probability, are taken into account to enhance the accuracy of the channel occupancy prediction. The proposed scheme in turn acts on the spectrum sensing algorithm parameters, since the spectrum handoff performance is significantly affected by them. The interference to the PUs can be reduced markedly by adopting the proposed spectrum handoff scheme, at the cost of a potential increase in the switching delay of SUs. The scheme also helps SUs save broadband scan time and select an appropriate target channel so as to avoid service blocking. Numerical results demonstrate these performance improvements.
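
    The prediction step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a two-state channel model (idle or occupied), binary sensing decisions, and an emission matrix built directly from the false alarm and missed detection probabilities. The function name `predict_occupancy` and all numeric values are hypothetical.

```python
import numpy as np

def predict_occupancy(sensed, A, p_fa, p_md, prior):
    """Forward-filter a two-state HMM and predict next-slot occupancy.

    States: 0 = idle, 1 = occupied by a PU (assumed model, not from the paper).
    sensed: sequence of 0/1 sensing decisions (1 = "channel looks busy").
    A[i][j]: P(next state = j | current state = i).
    p_fa: false alarm probability, P(sensed = 1 | idle).
    p_md: missed detection probability, P(sensed = 0 | occupied).
    prior: initial state distribution [P(idle), P(occupied)].
    """
    A = np.asarray(A, dtype=float)
    # Emission matrix B[state, observation] built from the sensing error rates.
    B = np.array([[1.0 - p_fa, p_fa],
                  [p_md, 1.0 - p_md]])
    belief = np.asarray(prior, dtype=float)
    for k, obs in enumerate(sensed):
        if k > 0:
            belief = belief @ A          # propagate one slot forward
        belief = belief * B[:, obs]      # weight by the observation likelihood
        belief = belief / belief.sum()   # renormalize to a distribution
    # One more propagation gives the predicted distribution for the next slot.
    return float((belief @ A)[1])

# Hypothetical parameters, chosen only for illustration.
A = [[0.9, 0.1],
     [0.2, 0.8]]
p_busy = predict_occupancy([1, 1, 1], A, p_fa=0.1, p_md=0.1, prior=[0.5, 0.5])
print(f"Predicted probability that the channel is occupied next: {p_busy:.3f}")
```

    An SU could compare the returned probability with a threshold to decide whether to vacate the channel before the PU appears; the choice of threshold is outside the scope of this sketch.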

  • The Jiggle-Viterbi Algorithm for the RFID Reader Using Structured Data-Encoded Waveforms

    Yung-Yi WANG  Jiunn-Tsair CHEN  

     
    PAPER

      Vol:
    E93-A No:11
      Page(s):
    2108-2114

    Signals received at the interrogator of an RFID system always suffer from various kinds of channel distortion, such as the path loss of the wireless channel, insufficient channel bandwidth resulting from multipath propagation, and the carrier frequency offset between tags and interrogators. In this paper we propose a novel Viterbi-based algorithm for joint detection of the data sequence and compensation of the distorted signal waveform. Under the assumption that the transmission clock is exactly synchronized at the reader, the proposed algorithm takes advantage of the structured data-encoded waveform to represent the modulation scheme of the RFID system as a trellis diagram, so that the Viterbi algorithm can be applied to perform data sequence estimation. Furthermore, to compensate for the distorted symbol waveform, the proposed Jiggle-Viterbi algorithm generates two substates, each corresponding to a variant structure waveform with adjustable temporal support, so that the symbol waveform deformation can be compensated, yielding significantly better performance in terms of bit error rate. Computer simulations show that even in the presence of a moderate carrier frequency offset, the proposed approach achieves acceptable accuracy in data sequence detection.
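
    For readers unfamiliar with trellis-based sequence estimation, the following is a minimal sketch of the plain Viterbi algorithm in the log domain. It is the standard textbook algorithm for a discrete HMM, not the Jiggle-Viterbi variant with substates described above; the function name `viterbi` and the toy parameters are hypothetical.

```python
import numpy as np

def viterbi(pi, A, B, obs):
    """Return the most likely state sequence for a discrete HMM.

    pi[i]: initial probability of state i.
    A[i][j]: transition probability from state i to state j.
    B[i][k]: probability of emitting symbol k in state i.
    obs: sequence of observed symbol indices.
    All probabilities are assumed strictly positive so the logs are finite.
    """
    log_pi = np.log(np.asarray(pi, dtype=float))
    log_A = np.log(np.asarray(A, dtype=float))
    log_B = np.log(np.asarray(B, dtype=float))
    T, n = len(obs), len(log_pi)

    delta = np.empty((T, n))            # best log-score of a path ending in each state
    back = np.zeros((T, n), dtype=int)  # best predecessor for each state
    delta[0] = log_pi + log_B[:, obs[0]]
    for t in range(1, T):
        # scores[i, j]: best path ending in state i, then moving to state j
        scores = delta[t - 1][:, None] + log_A
        back[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + log_B[:, obs[t]]

    # Trace the surviving path back from the best final state.
    path = [int(delta[-1].argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]

# Toy two-state example with hypothetical parameters.
states = viterbi(pi=[0.5, 0.5],
                 A=[[0.9, 0.1], [0.1, 0.9]],
                 B=[[0.9, 0.1], [0.1, 0.9]],
                 obs=[0, 0, 1, 1])
print(states)
```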

  • Acoustic Model Adaptation for Speech Recognition

    Koichi SHINODA  

     
    INVITED PAPER

      Vol:
    E93-D No:9
      Page(s):
    2348-2362

    Statistical speech recognition using continuous-density hidden Markov models (CDHMMs) has yielded many practical applications. However, in general, mismatches between the training data and input data significantly degrade recognition accuracy. Various acoustic model adaptation techniques using a few input utterances have been employed to overcome this problem. In this article, we survey these adaptation techniques, including maximum a posteriori (MAP) estimation, maximum likelihood linear regression (MLLR), and eigenvoice. We also present a schematic view called the adaptation pyramid to illustrate how these methods relate to each other.

  • Efficient Parallel Learning of Hidden Markov Chain Models on SMPs

    Lei LI  Bin FU  Christos FALOUTSOS  

     
    INVITED PAPER

      Vol:
    E93-D No:6
      Page(s):
    1330-1342

    Quad-core CPUs have become a common desktop configuration in today's offices. The increasing number of processors on a single chip opens new opportunities for parallel computing. Our goal is to make use of multi-core as well as multi-processor architectures to speed up large-scale data mining algorithms. In this paper, we present a general parallel learning framework, Cut-And-Stitch, for training hidden Markov chain models. In particular, we propose two model-specific variants, CAS-LDS for learning linear dynamical systems (LDS) and CAS-HMM for learning hidden Markov models (HMM). Our main contribution is a novel method to handle the data dependencies due to the chain structure of hidden variables, so as to parallelize the EM-based parameter learning algorithm. We implement CAS-LDS and CAS-HMM using OpenMP on two supercomputers and a quad-core commercial desktop. The experimental results show that parallel algorithms using Cut-And-Stitch achieve comparable accuracy and almost linear speedups over the traditional serial version.

  • A VLSI Architecture for Output Probability Computations of HMM-Based Recognition Systems with Store-Based Block Parallel Processing

    Kazuhiro NAKAMURA  Masatoshi YAMAMOTO  Kazuyoshi TAKAGI  Naofumi TAKAGI  

     
    PAPER-VLSI Systems

      Vol:
    E93-D No:2
      Page(s):
    300-305

    In this paper, a fast and memory-efficient VLSI architecture for output probability computations of continuous Hidden Markov Models (HMMs) is presented. These computations are the most time-consuming part of HMM-based recognition systems. High-speed VLSI architectures with small registers and low-power dissipation are required for the development of mobile embedded systems with capable human interfaces. We demonstrate store-based block parallel processing (StoreBPP) for output probability computations and present a VLSI architecture that supports it. When the number of HMM states is adequate for accurate recognition, compared with conventional stream-based block parallel processing (StreamBPP) architectures, the proposed architecture requires fewer registers and processing elements and less processing time. The processing elements used in the StreamBPP architecture are identical to those used in the StoreBPP architecture. From a VLSI architectural viewpoint, a comparison shows the efficiency of the proposed architecture through efficient use of registers for storing input feature vectors and intermediate results during computation.

  • An Improved Encoder for Joint Source-Channel Decoder Using Conditional Entropy Constraint

    Moonseo PARK  Seong-Lyun KIM  

     
    LETTER-Fundamental Theories for Communications

      Vol:
    E92-B No:6
      Page(s):
    2222-2225

    When the joint source-channel (JSC) decoder is used for source coding over noisy channels, the JSC decoder may invent errors even though the received data is not corrupted by the channel noise, if the JSC decoder assumes the channel was noisy. A novel encoder algorithm has been recently proposed to improve the performance of the communications system under this situation. In this letter, we propose another algorithm based on conditional entropy-constrained vector quantizer to further improve the encoder. The algorithm proposed in this letter significantly improves the performance of the communications system when the hypothesized channel bit error rate is high.

  • Distinctive Phonetic Feature (DPF) Extraction Based on MLNs and Inhibition/Enhancement Network

    Mohammad Nurul HUDA  Hiroaki KAWASHIMA  Tsuneo NITTA  

     
    PAPER-Speech and Hearing

      Vol:
    E92-D No:4
      Page(s):
    671-680

    This paper describes a distinctive phonetic feature (DPF) extraction method for use in a phoneme recognition system; our method has a low computation cost. This method comprises three stages. The first stage uses two multilayer neural networks (MLNs): MLN_LF-DPF, which maps continuous acoustic features, or local features (LFs), onto discrete DPF features, and MLN_Dyn, which constrains the DPF context at the phoneme boundaries. The second stage incorporates inhibition/enhancement (In/En) functionalities to discriminate whether the DPF dynamic patterns of trajectories are convex or concave, where convex patterns are enhanced and concave patterns are inhibited. The third stage decorrelates the DPF vectors using the Gram-Schmidt orthogonalization procedure before feeding them into a hidden Markov model (HMM)-based classifier. In an experiment on Japanese Newspaper Article Sentences (JNAS) utterances, the proposed feature extractor, which incorporates two MLNs and an In/En network, was found to provide a higher phoneme correct rate with fewer mixture components in the HMMs.

  • New Rotation-Invariant Texture Analysis Technique Using Radon Transform and Hidden Markov Models

    Abdul JALIL  Anwar MANZAR  Tanweer A. CHEEMA  Ijaz M. QURESHI  

     
    LETTER-Computer Graphics

      Vol:
    E91-D No:12
      Page(s):
    2906-2909

    A rotation-invariant texture analysis technique is proposed that uses a novel combination of the Radon Transform (RT) and Hidden Markov Models (HMMs). Features of a texture are extracted with the RT, which, owing to its inherent properties, captures all the directional properties of that texture. HMMs are used for classification. One HMM is trained for each texture on its feature vector, which preserves the rotational invariance of the feature vector in a more compact and useful form. Once all the HMMs have been trained, testing is done by picking any of these textures at an arbitrary orientation. The best percentage of correct classification (PCC), obtained on sixty textures from the Brodatz album, is above 98%.

  • A Fully Consistent Hidden Semi-Markov Model-Based Speech Recognition System

    Keiichiro OURA  Heiga ZEN  Yoshihiko NANKAKU  Akinobu LEE  Keiichi TOKUDA  

     
    PAPER-Speech and Hearing

      Vol:
    E91-D No:11
      Page(s):
    2693-2700

    In a hidden Markov model (HMM), state duration probabilities decrease exponentially with time, which fails to adequately represent the temporal structure of speech. One of the solutions to this problem is integrating state duration probability distributions explicitly into the HMM. This form is known as a hidden semi-Markov model (HSMM). However, though a number of attempts to use HSMMs in speech recognition systems have been proposed, they are not consistent because various approximations were used in both training and decoding. By avoiding these approximations using a generalized forward-backward algorithm, a context-dependent duration modeling technique and weighted finite-state transducers (WFSTs), we construct a fully consistent HSMM-based speech recognition system. In a speaker-dependent continuous speech recognition experiment, our system achieved about 9.1% relative error reduction over the corresponding HMM-based system.
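
    The opening claim of this abstract follows from the fact that a standard HMM state with self-transition probability a is occupied for exactly d frames with probability a^(d-1) (1 - a), a geometric distribution. The short sketch below only illustrates this well-known property; the function names and the value a = 0.9 are hypothetical and are not taken from the paper.

```python
def duration_pmf(a_self, d):
    """P(staying exactly d frames in a state) implied by a standard HMM.

    a_self: self-transition probability of the state (0 <= a_self < 1).
    d: duration in frames (d >= 1).
    """
    return a_self ** (d - 1) * (1.0 - a_self)

def expected_duration(a_self):
    """Mean of the geometric duration distribution, in frames."""
    return 1.0 / (1.0 - a_self)

# The probability is largest at d = 1 and decays by a factor a_self per frame,
# whatever the true duration statistics of the modeled speech segment are.
for d in (1, 2, 5, 10):
    print(d, duration_pmf(0.9, d))
print("mean duration:", expected_duration(0.9))
```

    An HSMM replaces this implicit geometric distribution with an explicit duration distribution per state, which is the change the abstract describes.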

  • HMM-Based Mask Estimation for a Speech Recognition Front-End Using Computational Auditory Scene Analysis

    Ji Hun PARK  Jae Sam YOON  Hong Kook KIM  

     
    LETTER-Speech and Hearing

      Vol:
    E91-D No:9
      Page(s):
    2360-2364

    In this paper, we propose a new mask estimation method for the computational auditory scene analysis (CASA) of speech using two microphones. The proposed method is based on a hidden Markov model (HMM) in order to incorporate an observation that the mask information should be correlated over contiguous analysis frames. In other words, HMM is used to estimate the mask information represented as the interaural time difference (ITD) and the interaural level difference (ILD) of two channel signals, and the estimated mask information is finally employed in the separation of desired speech from noisy speech. To show the effectiveness of the proposed mask estimation, we then compare the performance of the proposed method with that of a Gaussian kernel-based estimation method in terms of the performance of speech recognition. As a result, the proposed HMM-based mask estimation method provided an average word error rate reduction of 61.4% when compared with the Gaussian kernel-based mask estimation method.

  • Random Texture Defect Detection Using 1-D Hidden Markov Models Based on Local Binary Patterns

    Hadi HADIZADEH  Shahriar BARADARAN SHOKOUHI  

     
    PAPER

      Vol:
    E91-D No:7
      Page(s):
    1937-1945

    In this paper a novel method for the purpose of random texture defect detection using a collection of 1-D HMMs is presented. The sound textural content of a sample of training texture images is first encoded by a compressed LBP histogram and then the local patterns of the input training textures are learned, in a multiscale framework, through a series of HMMs according to the LBP codes which belong to each bin of this compressed LBP histogram. The hidden states of these HMMs at different scales are used as a texture descriptor that can model the normal behavior of the local texture units inside the training images. The optimal number of these HMMs (models) is determined in an unsupervised manner as a model selection problem. Finally, at the testing stage, the local patterns of the input test image are first predicted by the trained HMMs and a prediction error is calculated for each pixel position in order to obtain a defect map at each scale. The detection results are then merged by an inter-scale post fusion method for novelty detection. The proposed method is tested with a database of grayscale ceramic tile images.

  • View Invariant Human Action Recognition Based on Factorization and HMMs

    Xi LI  Kazuhiro FUKUI  

     
    PAPER

      Vol:
    E91-D No:7
      Page(s):
    1848-1854

    This paper addresses the problem of view-invariant action recognition using 2D trajectories of landmark points on the human body. It is a challenging task because, for a specific action category, the 2D observations of different instances might be extremely different due to varying viewpoints and changes in speed. By assuming that the execution of an action can be approximated by a dynamic linear combination of a set of basis shapes, a novel view-invariant human action recognition method is proposed based on non-rigid matrix factorization and Hidden Markov Models (HMMs). We show that the low-dimensional weight coefficients of the basis shapes, obtained by non-rigid factorization of the measurement matrix, contain the key information for action recognition regardless of viewpoint changes. Based on the extracted discriminative features, HMMs are used for temporal dynamic modeling and robust action classification. The proposed method is tested using real-life sequences, and promising performance is achieved.

  • Joint Blind Super-Resolution and Shadow Removing

    Jianping QIAO  Ju LIU  Yen-Wei CHEN  

     
    PAPER-Image Processing and Video Processing

      Vol:
    E90-D No:12
      Page(s):
    2060-2069

    Most learning-based super-resolution methods neglect the illumination problem. In this paper we propose a novel method that combines blind single-frame super-resolution and shadow removal into a single operation. Firstly, from the pattern recognition viewpoint, blur identification is considered as a classification problem. We describe three methods, based respectively on Vector Quantization (VQ), the Hidden Markov Model (HMM) and Support Vector Machines (SVM), to identify the blur parameter of the acquisition system from the compressed or uncompressed low-resolution image. Secondly, after blur identification, a super-resolution image is reconstructed by a learning-based method. In this method, a logarithmic-wavelet transform is defined for illumination-free feature extraction. An initial estimate is then obtained based on the assumption that small patches in low-resolution space and patches in high-resolution space share a similar local manifold structure. The unknown high-resolution image is reconstructed by projecting the intermediate result onto general reconstruction constraints. The proposed method simultaneously achieves blind single-frame super-resolution and image enhancement, especially shadow removal. Experimental results demonstrate the effectiveness and robustness of our method.

  • Dynamic Bayesian Network Inversion for Robust Speech Recognition

    Lei XIE  Hongwu YANG  

     
    LETTER-Speech and Hearing

      Vol:
    E90-D No:7
      Page(s):
    1117-1120

    This paper presents an inversion algorithm for dynamic Bayesian networks towards robust speech recognition, namely DBNI, which is a generalization of hidden Markov model inversion (HMMI). As a dual procedure of expectation maximization (EM)-based model reestimation, DBNI finds the 'uncontaminated' speech by moving the input noisy speech to the Gaussian means under the maximum likelihood (ML) sense given the DBN models trained on clean speech. This algorithm can provide both the expressive advantage from DBN and the noise-removal feature from model inversion. Experiments on the Aurora 2.0 database show that the hidden feature model (a typical DBN for speech recognition) with the DBNI algorithm achieves superior performance in terms of word error rate reduction.

  • A Hidden Semi-Markov Model-Based Speech Synthesis System

    Heiga ZEN  Keiichi TOKUDA  Takashi MASUKO  Takao KOBAYASHI  Tadashi KITAMURA  

     
    PAPER-Speech and Hearing

      Vol:
    E90-D No:5
      Page(s):
    825-834

    A statistical speech synthesis system based on the hidden Markov model (HMM) was recently proposed. In this system, spectrum, excitation, and duration of speech are modeled simultaneously by context-dependent HMMs, and speech parameter vector sequences are generated from the HMMs themselves. This system defines a speech synthesis problem in a generative model framework and solves it based on the maximum likelihood (ML) criterion. However, there is an inconsistency: although state duration probability density functions (PDFs) are explicitly used in the synthesis part of the system, they have not been incorporated into its training part. This inconsistency can make the synthesized speech sound less natural. In this paper, we propose a statistical speech synthesis system based on a hidden semi-Markov model (HSMM), which can be viewed as an HMM with explicit state duration PDFs. The use of HSMMs can solve the above inconsistency because we can incorporate the state duration PDFs explicitly into both the synthesis and the training parts of the system. Subjective listening test results show that use of HSMMs improves the reported naturalness of synthesized speech.

  • State Duration Modeling for HMM-Based Speech Synthesis

    Heiga ZEN  Takashi MASUKO  Keiichi TOKUDA  Takayoshi YOSHIMURA  Takao KOBAYASHI  Tadashi KITAMURA  

     
    LETTER-Speech and Hearing

      Vol:
    E90-D No:3
      Page(s):
    692-693

    This paper describes the explicit modeling of a state duration's probability density function in HMM-based speech synthesis. We redefine, in a statistically correct manner, the probability of staying in a state for a time interval used to obtain the state duration PDF and demonstrate improvements in the duration of synthesized speech.

  • A Systolic FPGA Architecture of Two-Level Dynamic Programming for Connected Speech Recognition

    Yong KIM  Hong JEONG  

     
    PAPER-Speech and Hearing

      Vol:
    E90-D No:2
      Page(s):
    562-568

    In this paper, we present an efficient architecture for connected word recognition that can be implemented on a field-programmable gate array (FPGA). The architecture is based on a newly derived two-level dynamic programming (TLDP) formulation that uses only bit addition and shift operations. The advantages of this architecture are its spatial efficiency, which accommodates more words in limited space, and the absence of multiplications, which increases computational speed by reducing propagation delays. The architecture is highly regular, consisting of identical and simple processing elements with only nearest-neighbor communication, and external communication occurs only through the end processing elements. In order to verify the proposed architecture, we have also designed and implemented it, prototyping with Xilinx FPGAs running at 33 MHz.

  • HHMM Based Recognition of Human Activity

    Daiki KAWANAKA  Takayuki OKATANI  Koichiro DEGUCHI  

     
    PAPER-Face, Gesture, and Action Recognition

      Vol:
    E89-D No:7
      Page(s):
    2180-2185

    In this paper, we present a method for recognizing human activity as a series of actions from an image sequence. The difficulty of the problem lies in a chicken-and-egg dilemma: each action needs to be extracted in advance in order to be recognized, but precise extraction is only possible after the action has been correctly identified. To resolve this dilemma, we use as many models as there are actions of interest, and test each model against a given sequence to find a matching model for each action occurring in the sequence. For each action, a model is designed so as to represent any activity containing the action. The hierarchical hidden Markov model (HHMM) is employed to represent the models: each model is composed of a submodel of the target action and submodels that can represent any action, connected appropriately. Several experimental results are shown.

  • A Hybrid HMM/Kalman Filter for Tracking Hip Angle in Gait Cycle

    Liang DONG  Jiankang WU  Xiaoming BAO  

     
    LETTER-Biological Engineering

      Vol:
    E89-D No:7
      Page(s):
    2319-2323

    Movement of the thighs is an important factor in studying the gait cycle. In this paper, a hybrid hidden Markov model (HMM)/Kalman filter (KF) scheme is proposed to track the hip angle during gait cycles. Within this framework, the HMM and the KF work in parallel to estimate the hip angle and detect major gait events. This approach has been applied to study the gait features of different subjects and compared with a video-based approach. Experimental results indicate that (1) the swing angle of the hip can be detected with a simple hardware configuration using biaxial accelerometers and (2) the hip angle can be tracked for different subjects within an error range of -5° to +5°.
